tun, conn, device: allocate buffers in the edge devices#49
Conversation
496dc1a to
61b6c72
Compare
ea6f82a to
e981b07
Compare
|
I ran some local, single TCP stream benchmarks w/iperf3 on Ubuntu 24.04 w/Intel i5-12400 CPUs. 4184faf (what's currently used by tailscale/tailscale): RSS is much improved w/lots of streams (-P 32): 1GB+ vs ~200MB |
472f369 to
74691ad
Compare
d7467f5 to
f357c5a
Compare
f357c5a to
3e6d836
Compare
eb07970 to
48e7cd2
Compare
c4597b0 to
ffd019f
Compare
ffd019f to
55464f0
Compare
There was a problem hiding this comment.
Pull request overview
This PR inverts packet-buffer ownership by introducing a device/buffer.Buffer wrapper (with optional recycling) and updating the TUN + UDP receive paths to allocate/initialize buffers at the I/O edges rather than in device.
Changes:
- Introduce
device/buffer(Buffer wrapper + default pooled backing arrays) and replace*[MaxMessageSize]byteownership in device queues withbuffer.Buffer. - Extract
WaitPoolinto a standalonewaitpoolpackage and update device pool wiring to use it. - Update
tun.Device.Readandconn.ReceiveFuncAPIs (and all platform implementations/tests) to use[]buffer.Bufferand return packet lengths viaBuffer.Dataslicing.
Reviewed changes
Copilot reviewed 41 out of 41 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| waitpool/waitpool.go | New capped sync.Pool wrapper package used outside device. |
| waitpool/waitpool_test.go | Updated tests for new package name + API (New) and atomic type. |
| waitpool/race_enabled_test.go | Package rename to waitpool. |
| waitpool/race_disabled_test.go | Package rename to waitpool. |
| tun/tun.go | Updates tun.Device.Read signature/documentation to []buffer.Buffer. |
| tun/tun_linux.go | Updates Linux TUN read path to allocate/resize buffer.Buffer and adapt virtio/GSO flow. |
| tun/tun_windows.go | Updates Windows TUN read path to use buffer.Buffer. |
| tun/tun_darwin.go | Updates Darwin TUN read path to use buffer.Buffer. |
| tun/tun_freebsd.go | Updates FreeBSD TUN read path to use buffer.Buffer. |
| tun/tun_openbsd.go | Updates OpenBSD TUN read path to use buffer.Buffer. |
| tun/tun_plan9.go | Updates Plan 9 TUN read path to use buffer.Buffer. |
| tun/tuntest/tuntest.go | Updates tuntest device read helper to the new buffer API. |
| tun/netstack/tun.go | Updates netstack TUN wrapper read implementation to use buffer.Buffer. |
| tun/offload.go | Updates GSOSplit signature to operate on []buffer.Buffer and set lengths via Data slicing. |
| tun/offload_test.go | Updates fuzz test to allocate []buffer.Buffer outputs and new GSOSplit signature. |
| tun/offload_linux_test.go | Updates virtio read tests to use []buffer.Buffer and infer sizes from len(Data). |
| device/buffer/buffer.go | New Buffer type with Claim/Release + recycler hook for pooled backings. |
| device/buffer/pool.go | New default buffer pool + EnsureAllocated helper. |
| device/buffer/pool_test.go | Benchmark for capped vs uncapped pool usage. |
| device/buffer/constants.go | Defines max backing size used by the buffer pool. |
| device/buffer/constants_default.go | Default platform values for max read size + pool cap. |
| device/buffer/constants_android.go | Android-specific max read size + pool cap. |
| device/buffer/constants_ios.go | iOS-specific max read size + pool cap (var). |
| device/buffer/constants_windows.go | Windows-specific max read size + pool cap. |
| device/constants.go | Replaces MaxSegmentSize-based sizing with buffer.MaxReadSize. |
| device/pools.go | Switches device pools to waitpool and removes message-buffer pool in favor of buffer. |
| device/device.go | Updates pool field types to *waitpool.WaitPool. |
| device/send.go | Refactors outbound pipeline to claim/release buffer.Buffer ownership per element. |
| device/receive.go | Refactors inbound pipeline to claim/release buffer.Buffer ownership per element. |
| device/channels.go | Updates queue draining to release buffer.Buffer instead of returning message buffers. |
| device/device_test.go | Updates fake TUN device signature to match new read API. |
| device/queueconstants_default.go | Removes preallocation-related constants now owned by device/buffer. |
| device/queueconstants_android.go | Removes preallocation-related constants now owned by device/buffer. |
| device/queueconstants_ios.go | Removes preallocation-related constants now owned by device/buffer. |
| device/queueconstants_windows.go | Removes preallocation-related constants now owned by device/buffer. |
| conn/conn.go | Updates ReceiveFunc type to accept []buffer.Buffer. |
| conn/conn_test.go | Updates test type declaration for new ReceiveFunc signature. |
| conn/bind_std.go | Updates std net bind receive path to write into buffer.Buffer. |
| conn/bind_std_test.go | Updates std net bind test harness to use buffer.Buffer. |
| conn/bind_windows.go | Updates Windows ring bind receive path to use buffer.Buffer. |
| conn/bindtest/bindtest.go | Updates bind test receive function to allocate/write into buffer.Buffer. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
7c2ff1b to
996c482
Compare
996c482 to
f41317e
Compare
f41317e to
4b0fb08
Compare
4b0fb08 to
1651d0b
Compare
0824a03 to
14fb57a
Compare
21b0c5d to
d42e7a5
Compare
| // BackingGo holds a Go-allocated backing object. Zero otherwise. | ||
| BackingGo unsafe.Pointer | ||
|
|
||
| // BackingExt holds an opaque address into off-heap memory (e.g. a | ||
| // WinRio ring slot, an AF_XDP region). Zero otherwise. | ||
| BackingExt uintptr |
There was a problem hiding this comment.
I prototyped a unified Backing uintptr. Losing Bytes in such case of course is catastrophic, but otherwise it's a minor edit to go back if we trust consuming devs enough.
d42e7a5 to
d08449b
Compare
Previously device maintained a batch of preallocated to the MaxMessageSize buffers that the I/O only needed to read into. This change inverts buffer ownership. A (wrapped) nil pointer is passed into I/O. Device expects a backing array to be allocated, and a slice of read data cut to offset+size, returned from read within the same wrapper. The wrapper is defined in iobuf.View, and carries the backing reference and an optional Recycler implementation to return the backing for reuse. I/O is encouraged to implement a buffer management solution that works best for it. A shared sync.Pool is provided as a default option. Updates tailscale/corp#36989 Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com> Change-Id: I58908d9d3fd09441e9378a74b0ee19136a6a6964
14fb57a to
dfe5c44
Compare
d08449b to
2b8b0b9
Compare
This is a part of series of changes.
Stacked on top of #54. This PR is concerned only with the allocation inversion and buffer recycling logic. #61 implements the right-sized allocations for the
bind, #62 for thetun.While sweeping, most of the diff is perfunctory threading of the new type through
device.Second largest after the API change concern may be the public View fields. If those fields were private, there'd be no hazard of BackingX diverging from the array. This was prototyped, and produced quite a lot of get/set helpers for various Bytes operations a-la gVisor. Keeping View entirely open brings non-insignificant ergonomic gains. I leave it up to the discussion, with my preference for trusting the device implementer and keeping things simple.
Previously device maintained a batch of preallocated to
the MaxMessageSize buffers that the I/O only needed to
read into.
This change inverts buffer ownership. A (wrapped) nil
pointer is passed into I/O. Device expects a backing
array to be allocated, and a slice of read data cut to offset+size,
returned from read within the same wrapper.
The wrapper is defined in iobuf.View, and carries
the backing reference and an optional Recycler implementation
to return the backing for reuse.
I/O is encouraged to implement a buffer management solution
that works best for it. A shared sync.Pool is provided as
a default option.
Updates tailscale/corp#36989